4 research outputs found

    FloDB: Unlocking Memory in Persistent Key-Value Stores

    Log-structured merge (LSM) data stores make it possible to store and process large volumes of data while maintaining good performance. They mitigate the I/O bottleneck by absorbing updates in a memory layer and transferring them to the disk layer in sequential batches. Yet, the LSM architecture fundamentally requires elements to be in sorted order. As the amount of data in memory grows, maintaining this sorted order becomes increasingly costly. Contrary to intuition, existing LSM systems could actually lose throughput with larger memory components. In this paper, we introduce FloDB, an LSM memory component architecture which allows throughput to scale on modern multicore machines with ample memory sizes. The main idea underlying FloDB is essentially to bootstrap the traditional LSM architecture by adding a small in-memory buffer layer on top of the memory component. This buffer offers low-latency operations, masking the write latency of the sorted memory component. Integrating this buffer in the classic LSM memory component to obtain FloDB is not trivial and requires revisiting the algorithms of the user-facing LSM operations (search, update, scan). FloDB's two layers can be implemented with state-of-the-art, highly concurrent data structures. This way, as we show in the paper, FloDB eliminates significant synchronization bottlenecks in classic LSM designs, while offering a rich LSM API. We implement FloDB as an extension of LevelDB, Google's popular LSM key-value store. We compare FloDB's performance to that of state-of-the-art LSMs. In short, FloDB's performance is up to one order of magnitude higher than that of the next best-performing competitor in a wide range of multi-threaded workloads.
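
    The sketch below illustrates the two-layer memory component described in the abstract, using std::unordered_map and std::map as stand-ins for the concurrent hash table and skiplist that FloDB actually builds on. All class and method names are illustrative assumptions; concurrency, background draining threads, and the disk layer are omitted.

```cpp
// Minimal sketch of a two-layer LSM memory component in the spirit of FloDB.
// std::unordered_map and std::map stand in for the concurrent hash table and
// skiplist; concurrency, draining threads, and the disk layer are omitted.
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <unordered_map>

class TwoLayerMemtable {
public:
    explicit TwoLayerMemtable(std::size_t buffer_capacity)
        : buffer_capacity_(buffer_capacity) {}

    // Updates land in the small unsorted buffer, which offers low-latency
    // writes and masks the cost of inserting into the sorted layer.
    void put(const std::string& key, const std::string& value) {
        buffer_[key] = value;
        if (buffer_.size() >= buffer_capacity_) drain();
    }

    // Reads check the freshest layer (the buffer) first, then the sorted layer.
    std::optional<std::string> get(const std::string& key) const {
        if (auto it = buffer_.find(key); it != buffer_.end()) return it->second;
        if (auto it = sorted_.find(key); it != sorted_.end()) return it->second;
        return std::nullopt;  // not in memory; a real store would consult disk
    }

    // Scans need sorted order, so the buffer is emptied into the sorted layer
    // first. FloDB uses dedicated algorithms for this; here it is a plain
    // synchronous drain.
    std::map<std::string, std::string> scan(const std::string& lo,
                                            const std::string& hi) {
        drain();
        std::map<std::string, std::string> out;
        for (auto it = sorted_.lower_bound(lo);
             it != sorted_.end() && it->first <= hi; ++it) {
            out.insert(*it);
        }
        return out;
    }

private:
    // Move buffered entries into the sorted layer (done by background threads
    // against concurrent structures in the real design).
    void drain() {
        for (auto& kv : buffer_) sorted_[kv.first] = std::move(kv.second);
        buffer_.clear();
    }

    std::size_t buffer_capacity_;
    std::unordered_map<std::string, std::string> buffer_;  // unsorted, fast
    std::map<std::string, std::string> sorted_;            // sorted layer
};
```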

    Fast and Robust Memory Reclamation for Concurrent Data Structures

    In concurrent systems without automatic garbage collection, it is challenging to determine when it is safe to reclaim memory, especially for lock-free data structures. Existing concurrent memory reclamation schemes are either fast but do not tolerate process delays, robust to delays but with high overhead, or both robust and fast but narrowly applicable. This paper proposes QSense, a novel concurrent memory reclamation technique. QSense is a hybrid technique with a fast path and a fallback path. In the common case (without process delays), a high-performing memory reclamation scheme is used (fast path). If process delays block memory reclamation through the fast path, a robust fallback path is used to guarantee progress. The fallback path uses hazard pointers, but avoids their notorious need for frequent and expensive memory fences. QSense is widely applicable, as we illustrate through several lock-free data structure algorithms. Our experimental evaluation shows that QSense has an overhead comparable to the fastest memory reclamation techniques, while still tolerating prolonged process delays.
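
    To make the fast-path/fallback structure concrete, here is a heavily simplified sketch of a hybrid reclamation scheme along the lines described above: a quiescence/epoch-based fast path plus a hazard-pointer fallback used when a thread appears delayed. The thread count, slot layout, and delay heuristic are illustrative assumptions; the actual QSense algorithm coordinates the two paths much more carefully and, unlike this sketch, avoids frequent memory fences.

```cpp
// Heavily simplified sketch of a hybrid reclamation scheme: an epoch-based
// fast path plus a hazard-pointer fallback for delayed threads. All sizes and
// heuristics here are illustrative assumptions, not QSense's actual algorithm.
#include <array>
#include <atomic>
#include <cstdint>
#include <vector>

constexpr int kMaxThreads = 8;

struct Node { int value = 0; Node* next = nullptr; };

struct Retired { Node* node; std::uint64_t epoch; };  // epoch at retirement

struct ThreadRec {
    std::atomic<std::uint64_t> local_epoch{0};   // last observed global epoch
    std::atomic<Node*>         hazard{nullptr};  // pointer currently in use
    std::vector<Retired>       limbo;            // retired, not yet freed
};

std::atomic<std::uint64_t> g_epoch{1};
std::array<ThreadRec, kMaxThreads> g_threads;

// Fast-path bookkeeping: each thread announces quiescent states periodically.
void quiescent(int tid) { g_threads[tid].local_epoch.store(g_epoch.load()); }

// Fallback-path bookkeeping: publish a pointer before dereferencing it.
Node* protect(int tid, const std::atomic<Node*>& src) {
    Node* p;
    do {
        p = src.load();
        g_threads[tid].hazard.store(p);
    } while (p != src.load());  // re-validate after publishing
    return p;
}

// Fast-path check: every thread passed a quiescent state since retirement.
bool safe_by_epochs(std::uint64_t retire_epoch) {
    for (const auto& t : g_threads)
        if (t.local_epoch.load() <= retire_epoch) return false;
    return true;
}

// Fallback check: no thread has published the node as hazardous.
bool safe_by_hazards(Node* n) {
    for (const auto& t : g_threads)
        if (t.hazard.load() == n) return false;
    return true;
}

void retire(int tid, Node* n) {
    g_threads[tid].limbo.push_back({n, g_epoch.load()});
}

// Reclaim from this thread's limbo list. `some_thread_delayed` stands for a
// heuristic (e.g., an epoch that has not advanced for too long); when it is
// true, the hazard-pointer scan is used because it tolerates stalled threads.
void reclaim(int tid, bool some_thread_delayed) {
    std::vector<Retired> keep;
    for (const Retired& r : g_threads[tid].limbo) {
        bool safe = some_thread_delayed ? safe_by_hazards(r.node)
                                        : safe_by_epochs(r.epoch);
        if (safe) delete r.node; else keep.push_back(r);
    }
    g_threads[tid].limbo.swap(keep);
    g_epoch.fetch_add(1);  // advance the global epoch from time to time
}
```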

    TRIAD: creating synergies between memory, disk and log in log structured key-value stores

    We present TRIAD, a new persistent key-value (KV) store based on Log-Structured Merge (LSM) trees. TRIAD improves LSM KV throughput by reducing the write amplification arising in the maintenance of the LSM tree structure. Although occurring in the background, write amplification consumes significant CPU and I/O resources. By reducing write amplification, TRIAD allows these resources to be used instead to improve user-facing throughput. TRIAD uses a holistic combination of three techniques. At the LSM memory component level, TRIAD leverages skew in data popularity to avoid frequent I/O operations on the most popular keys. At the storage level, TRIAD amortizes management costs by deferring and batching multiple I/O operations. At the commit log level, TRIAD avoids duplicate writes to storage. We implement TRIAD as an extension of Facebook's RocksDB and evaluate it with production and synthetic workloads. With these workloads, TRIAD yields up to 193% improvement in throughput. It reduces write amplification by up to 4x, and decreases the amount of I/O by an order of magnitude.
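
    The sketch below illustrates the memory-level technique mentioned in the abstract: at flush time, the most frequently updated keys are held back in the new memory component and only the cold majority is written to disk. The data structures and the popularity heuristic are simplified assumptions for illustration; durability of the retained hot entries relies on the commit log, which TRIAD keeps instead of rewriting.

```cpp
// Illustrative sketch of hot/cold separation at memtable flush, in the spirit
// of TRIAD's memory-level technique. Structures and the popularity heuristic
// are simplified assumptions.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Entry {
    std::string   value;
    std::uint64_t update_count = 0;  // simple popularity counter
};

using Memtable = std::map<std::string, Entry>;

// Split a full memtable into (hot, cold) by update count: the k_hot most
// frequently updated keys stay in memory, the rest are flushed.
std::pair<Memtable, Memtable> split_hot_cold(const Memtable& full,
                                             std::size_t k_hot) {
    std::vector<std::pair<std::string, Entry>> items(full.begin(), full.end());
    std::size_t k = std::min(k_hot, items.size());
    std::partial_sort(items.begin(), items.begin() + k, items.end(),
                      [](const auto& a, const auto& b) {
                          return a.second.update_count > b.second.update_count;
                      });
    Memtable hot, cold;
    for (std::size_t i = 0; i < items.size(); ++i)
        (i < k ? hot : cold).insert(items[i]);
    return {hot, cold};
}

// Flush: write only the cold part to an SSTable; seed the next memtable with
// the hot part so popular keys do not repeatedly flow through flushes and
// compactions.
Memtable flush(const Memtable& full, std::size_t k_hot,
               void (*write_sstable)(const Memtable&)) {
    auto [hot, cold] = split_hot_cold(full, k_hot);
    write_sstable(cold);  // sequential I/O for the cold majority
    return hot;           // becomes the initial content of the new memtable
}
```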

    Redesigning Persistent Key-Value Stores for Future Workloads, Hardware, and Performance Requirements

    Cloud storage stacks are being challenged by new workloads, new hardware, and new performance requirements. First, workloads evolved from following a read-heavy pattern (e.g., a static web-page) to a write-heavy profile where the read:write ratio is closer to 1:1 (e.g., as in the Internet of Things). Second, the hardware is undergoing rapid changes. The divide between fine-grained volatile memory and slow block-level storage is rapidly being bridged by the emerging byte-addressable non-volatile memory devices and the fast block-addressable NVMe SSDs (e.g., Intel Optane NVMe SSDs). Third, performance requirements in storage systems now emphasize low tail latency, in addition to high throughput. This dissertation argues that existing storage systems, in particular persistent key-value stores (KVs), have fundamental limitations that do not allow them to fully meet these challenges. This dissertation proposes four new KVs designed for future hardware, workloads, and performance requirements. FloDB shows how to scale the throughput of KVs on servers with ample memory sizes of up to hundreds of GBs. TRIAD introduces novel techniques to reduce write amplification and to increase throughput in log-structured merge based KVs running on SSDs. SILK presents an I/O bandwidth scheduler to decrease tail latency in log-structured merge based KVs. Finally, KVell demonstrates that NVMe SSDs shift the performance bottleneck from I/O to CPU, invalidating an assumption that has underpinned all past storage system design. In line with this observation, KVell then presents a new design for KVs that departs from the conventional wisdom of optimizing disk usage and instead optimizes CPU usage.
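
    As a small illustration of the I/O bandwidth scheduling idea attributed to SILK above, the toy scheduler below gives flushes priority over compactions and grants background work only the bandwidth left over by client requests. The concrete policy, names, and parameters are assumptions for illustration, not SILK's published algorithm.

```cpp
// Toy I/O bandwidth scheduler for LSM internal operations: prioritize
// latency-critical work (flushes) over compactions, and let background I/O
// use only the bandwidth left over by client requests. Illustrative only.
#include <cstdint>
#include <queue>
#include <vector>

enum class OpKind { Flush = 0, LowLevelCompaction = 1, HighLevelCompaction = 2 };

struct InternalOp {
    OpKind        kind;
    std::uint64_t bytes;  // amount of I/O this operation will perform
};

// Lower enum value == more latency-critical == scheduled first.
struct ByPriority {
    bool operator()(const InternalOp& a, const InternalOp& b) const {
        return static_cast<int>(a.kind) > static_cast<int>(b.kind);
    }
};

class IoScheduler {
public:
    void submit(const InternalOp& op) { pending_.push(op); }

    // Choose operations to run in the next interval. `client_bandwidth` is the
    // I/O currently consumed by user requests; internal work gets whatever is
    // left of the device budget, so background I/O shrinks under heavy load.
    std::vector<InternalOp> schedule(std::uint64_t device_budget,
                                     std::uint64_t client_bandwidth) {
        std::uint64_t budget =
            device_budget > client_bandwidth ? device_budget - client_bandwidth : 0;
        std::vector<InternalOp> batch;
        while (!pending_.empty() && pending_.top().bytes <= budget) {
            batch.push_back(pending_.top());
            budget -= pending_.top().bytes;
            pending_.pop();
        }
        return batch;
    }

private:
    std::priority_queue<InternalOp, std::vector<InternalOp>, ByPriority> pending_;
};
```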